
Create data pipeline: gold_user_engagement #6

Open
zingleai wants to merge 1 commit into main from dp/finalpr1

Conversation

Owner

@zingleai zingleai commented Feb 23, 2026

Summary

AI-assisted pipeline: User engagement metrics by channel (7-day window) for Marketing dashboard. Includes active users, sessions, and avg session duration.

Tags: engagement, marketing
Criticality: P1

Models (1)

| Model | Layer | Materialization | Columns | Upstream |
|---|---|---|---|---|
| gold_user_engagement | | table | 0 | |

Lineage

```mermaid
graph LR; stg_user_activity --> gold_user_engagement
```

Data Quality Tests (2)

  • [gold_user_engagement] not_null_channel
  • [gold_user_engagement] positive_active_users
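
The two tests are listed without definitions; in a dbt project they would typically be declared in the model's schema file. A sketch of what that might look like, using the test names from the list above (the use of `dbt_utils.accepted_range` for the positivity check is an assumption, not confirmed by this PR):

```yaml
# Hypothetical schema.yml entry for gold_user_engagement
models:
  - name: gold_user_engagement
    columns:
      - name: channel
        tests:
          - not_null            # not_null_channel
      - name: active_users
        tests:
          # positive_active_users: assumes the dbt_utils package is installed
          - dbt_utils.accepted_range:
              min_value: 0
              inclusive: false
```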

Generated by Data Portal

Summary by CodeRabbit

  • New Features
    • New user engagement analytics now available, providing metrics such as active users count, total sessions, and average session duration organized by channel for the last 7 days.


coderabbitai bot commented Feb 23, 2026

📝 Walkthrough

Walkthrough

A new dbt model gold_user_engagement is introduced to aggregate user engagement metrics by channel. The model computes active user counts, total sessions, and average session durations from staging activity data, filtered to the last 7 days and grouped by channel.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **User Engagement Model**: `dbt/models/marts/gold_user_engagement.sql` | New mart model added to compute aggregated engagement metrics (active users, total sessions, average session duration) by channel over a 7-day rolling window from the staging activity table. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A shiny new model hops into view,
Channel metrics bundled, aggregated anew,
Seven days of engagement, grouped with care,
User sessions counted, dashboards laid bare! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The title 'Create data pipeline: gold_user_engagement' directly and specifically describes the main change: adding a new dbt model for user engagement metrics. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 3

🧹 Nitpick comments (1)
dbt/models/marts/gold_user_engagement.sql (1)

10-10: Replace positional GROUP BY 1 with an explicit column reference.

Ordinal references silently break if the SELECT list is reordered, making the model fragile for future edits.

♻️ Proposed fix

```diff
-GROUP BY 1
+GROUP BY channel
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dbt/models/marts/gold_user_engagement.sql` at line 10, The model uses a
positional GROUP BY ("GROUP BY 1") which is fragile; open
dbt/models/marts/gold_user_engagement.sql, identify the SELECT list first
expression (the column or expression that the ordinal 1 refers to, e.g., user_id
or event_date) and replace "GROUP BY 1" with an explicit "GROUP BY
<that_column_or_expression>" (if multiple ordinals were intended, list each
actual column name). Ensure you use the exact column identifier from the SELECT
(including any alias) so the GROUP BY matches the SELECT output.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dbt/models/marts/gold_user_engagement.sql`:
- Line 9: The WHERE clause using activity_date >= DATEADD(day, -7, CURRENT_DATE)
is off-by-one and yields 8 days (includes today); update the filter for the
intended 7-day window by either (a) adding an upper bound activity_date <
CURRENT_DATE to get 7 complete days excluding today, or (b) shifting the lower
bound to DATEADD(day, -6, CURRENT_DATE) to include today as the 7th day—modify
the WHERE clause referencing activity_date, DATEADD and CURRENT_DATE accordingly
to match the business requirement.
- Line 9: Replace the warehouse-specific DATEADD usage in the WHERE clause with
dbt's cross-warehouse macro: locate the WHERE filtering on activity_date that
currently uses DATEADD(day, -7, CURRENT_DATE) and call the dbt.dateadd macro
instead (keep the same datepart "day", interval -7, and CURRENT_DATE as the date
argument) so the compiled SQL will work across adapters; update the expression
that references activity_date and CURRENT_DATE accordingly.
- Around line 1-10: The model gold_user_engagement is missing a dbt config block
so it will default to the project-level materialization (likely view) instead of
the intended table; add a top-of-file {{ config(materialized='table') }} block
(before the SELECT) in the gold_user_engagement model to explicitly set
materialized: table, leaving the rest of the query (including
ref('stg_user_activity')) unchanged.

---

Nitpick comments:
In `@dbt/models/marts/gold_user_engagement.sql`:
- Line 10: The model uses a positional GROUP BY ("GROUP BY 1") which is fragile;
open dbt/models/marts/gold_user_engagement.sql, identify the SELECT list first
expression (the column or expression that the ordinal 1 refers to, e.g., user_id
or event_date) and replace "GROUP BY 1" with an explicit "GROUP BY
<that_column_or_expression>" (if multiple ordinals were intended, list each
actual column name). Ensure you use the exact column identifier from the SELECT
(including any alias) so the GROUP BY matches the SELECT output.
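
Taken together, the three inline findings (missing config block, 8-day window, warehouse-specific DATEADD) imply a model along these lines. This is a sketch that assumes all of the reviewer's suggestions are adopted and that "7 complete days excluding today" is the intended business rule:

```sql
{{ config(materialized='table') }}

-- Model: gold_user_engagement
-- User engagement metrics by channel for Marketing dashboard
SELECT
    channel,
    COUNT(DISTINCT user_id) AS active_users,
    SUM(sessions) AS total_sessions,
    AVG(session_duration_minutes) AS avg_session_duration
FROM {{ ref('stg_user_activity') }}
-- 7 complete days, excluding today's still-accumulating data
WHERE activity_date >= {{ dbt.dateadd(datepart="day", interval=-7, from_date_or_timestamp="CURRENT_DATE") }}
  AND activity_date < CURRENT_DATE
GROUP BY channel
```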

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1fca229 and a3bd794.

📒 Files selected for processing (1)
  • dbt/models/marts/gold_user_engagement.sql

Comment on lines +1 to +10
```sql
-- Model: gold_user_engagement
-- User engagement metrics by channel for Marketing dashboard
SELECT
    channel,
    COUNT(DISTINCT user_id) AS active_users,
    SUM(sessions) AS total_sessions,
    AVG(session_duration_minutes) AS avg_session_duration
FROM {{ ref('stg_user_activity') }}
WHERE activity_date >= DATEADD(day, -7, CURRENT_DATE)
GROUP BY 1
```

⚠️ Potential issue | 🟠 Major

Missing {{ config() }} block — model will not materialize as a table.

The PR metadata specifies materialized: table, but there is no {{ config() }} block in the file. Without it, dbt uses the project-level default (typically view), silently mismatching the intended materialisation.

🔧 Proposed fix

```diff
+{{ config(materialized='table') }}
+
 -- Model: gold_user_engagement
 -- User engagement metrics by channel for Marketing dashboard
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dbt/models/marts/gold_user_engagement.sql` around lines 1 - 10, The model
gold_user_engagement is missing a dbt config block so it will default to the
project-level materialization (likely view) instead of the intended table; add a
top-of-file {{ config(materialized='table') }} block (before the SELECT) in the
gold_user_engagement model to explicitly set materialized: table, leaving the
rest of the query (including ref('stg_user_activity')) unchanged.

```sql
SUM(sessions) AS total_sessions,
AVG(session_duration_minutes) AS avg_session_duration
FROM {{ ref('stg_user_activity') }}
WHERE activity_date >= DATEADD(day, -7, CURRENT_DATE)
```

⚠️ Potential issue | 🟠 Major

Off-by-one: >= DATEADD(day, -7, CURRENT_DATE) spans 8 days, not 7.

With today as the anchor, DATEADD(day, -7, CURRENT_DATE) resolves to 7 days ago. Using >= therefore includes that anchor day plus 7 subsequent days (including today) — a total of 8 days. Additionally, aggregating today's still-accumulating data skews the daily metrics for the Marketing dashboard.

For 7 complete days (exclusive of today):

```sql
WHERE activity_date >= DATEADD(day, -7, CURRENT_DATE)
  AND activity_date < CURRENT_DATE
```

Or, for a rolling 7-day window ending today inclusive but exactly 7 days:

```sql
WHERE activity_date >= DATEADD(day, -6, CURRENT_DATE)
```

Choose whichever matches the business requirement.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dbt/models/marts/gold_user_engagement.sql` at line 9, The WHERE clause using
activity_date >= DATEADD(day, -7, CURRENT_DATE) is off-by-one and yields 8 days
(includes today); update the filter for the intended 7-day window by either (a)
adding an upper bound activity_date < CURRENT_DATE to get 7 complete days
excluding today, or (b) shifting the lower bound to DATEADD(day, -6,
CURRENT_DATE) to include today as the 7th day—modify the WHERE clause
referencing activity_date, DATEADD and CURRENT_DATE accordingly to match the
business requirement.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

`dbt dateadd macro documentation`

💡 Result:

dateadd is now documented as a dbt Core cross-database macro (namespace: dbt), not in dbt_utils (those cross-db macros were removed from dbt_utils starting in v1.0). [1] (docs.getdbt.com)

dbt dateadd macro (Core)

Signature

{{ dbt.dateadd(datepart="day", interval=1, from_date_or_timestamp="..." ) }}

Args

  • datepart: a date/time part (database-specific)
  • interval: integer (positive or negative)
  • from_date_or_timestamp: a date/time expression [1] (docs.getdbt.com)

Examples

```
{{ dbt.dateadd(datepart="day", interval=1, from_date_or_timestamp="'2016-03-09'") }}
{{ dbt.dateadd(datepart="month", interval=-2, from_date_or_timestamp="'2016-03-09'") }}
```

[1] (docs.getdbt.com)

### Sources

  • [1] dbt Developer Hub, "Cross-database macros" (includes `dateadd`): https://docs.getdbt.com/reference/dbt-jinja-functions/cross-database-macros
  • [2] dbt-utils README note about cross-database macros moving to dbt Core: https://github.com/dbt-labs/dbt-utils

---

🌐 Web query:

`DATEADD function Snowflake Redshift PostgreSQL BigQuery compatibility`

💡 Result:

### `DATEADD` compatibility overview (Snowflake vs Redshift vs PostgreSQL vs BigQuery)

| Engine | Has `DATEADD`? | Add *N* units to a date/time |
|---|---:|---|
| **Snowflake** | Yes | `DATEADD(part, value, expr)` (aliases: `TIMEADD`, `TIMESTAMPADD`) [1] |
| **Amazon Redshift** | Yes | `DATEADD(datepart, interval, expr)` [2] |
| **PostgreSQL** | **No** (built-in) | Use arithmetic: `timestamp + interval '1 day'`, `date + integer` (days), etc. [5] |
| **BigQuery (GoogleSQL)** | **No** (`DATEADD` name) | Use typed functions: `DATE_ADD(date, INTERVAL n part)`, `DATETIME_ADD(...)`, `TIMESTAMP_ADD(...)`, `TIME_ADD(...)` [3][4] |

---

## Syntax mappings

### Snowflake → Redshift (nearly 1:1)
Both accept the same 3-argument pattern: `(part, number, expression)` [1][2].

```sql
-- Snowflake / Redshift
DATEADD(day, 7, some_ts)
```

### Snowflake / Redshift → BigQuery

BigQuery uses `*_ADD(expr, INTERVAL n PART)` (note the INTERVAL clause and PART position) [3][4]. Google’s Snowflake migration guide summarizes this mapping as:

  • DATEADD(part, value, dateOrTime) → DATE_ADD(date, INTERVAL value part) [6]

```sql
-- BigQuery
DATE_ADD(some_date, INTERVAL 7 DAY)
TIMESTAMP_ADD(some_ts, INTERVAL 7 DAY)
```

### Snowflake / Redshift → PostgreSQL

PostgreSQL typically uses `+ interval` (and for a date value, `+ integer` adds days) [5].

```sql
-- PostgreSQL
some_ts + interval '7 day'
some_date + 7
```

Key behavioral gotchas (portability)

  • Return type differences when input is DATE and you add sub-day units

    • Snowflake: adding hour/minute/second to a DATE returns a timestamp (TIMESTAMP_NTZ) starting at midnight [1].
    • Redshift: DATEADD returns TIMESTAMP for date inputs (date becomes timestamp at 00:00:00) [2].
  • Month/quarter/year end-of-month behavior

    • BigQuery: for MONTH/QUARTER/YEAR, if the target month is shorter, the result becomes the last day of that month [3].
    • Snowflake: warns that year/quarter/month can shift the day-of-month when the result month has fewer days [1].
    • PostgreSQL interval math explicitly uses “last day of month” when needed while applying the months field of an interval [5].
  • BigQuery TIMESTAMP_ADD supports only up to DAY

    • TIMESTAMP_ADD(..., INTERVAL n DAY/HOUR/MINUTE/...) is supported, but not MONTH/QUARTER/YEAR for TIMESTAMP_ADD [4]. (Use DATETIME_ADD, or convert to DATETIME/DATE where appropriate.)

Sources: Snowflake DATEADD [1], Redshift DATEADD [2], BigQuery DATE_ADD [3], BigQuery TIMESTAMP_ADD [4], PostgreSQL date/time operators & interval rules [5], BigQuery Snowflake translation guide [6].


Use the {{ dbt.dateadd() }} macro instead of DATEADD for cross-warehouse compatibility.

DATEADD(datepart, interval, date) is native to Snowflake and Redshift but PostgreSQL does not have a built-in DATEADD function, and BigQuery uses DATE_ADD/DATETIME_ADD/TIMESTAMP_ADD. dbt provides a cross-warehouse {{ dbt.dateadd() }} macro that compiles to the correct dialect for each adapter.

🔧 Proposed fix using dbt's built-in macro

```diff
-WHERE activity_date >= DATEADD(day, -7, CURRENT_DATE)
+WHERE activity_date >= {{ dbt.dateadd(datepart="day", interval=-7, from_date_or_timestamp="CURRENT_DATE") }}
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```diff
-WHERE activity_date >= DATEADD(day, -7, CURRENT_DATE)
+WHERE activity_date >= {{ dbt.dateadd(datepart="day", interval=-7, from_date_or_timestamp="CURRENT_DATE") }}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dbt/models/marts/gold_user_engagement.sql` at line 9, Replace the
warehouse-specific DATEADD usage in the WHERE clause with dbt's cross-warehouse
macro: locate the WHERE filtering on activity_date that currently uses
DATEADD(day, -7, CURRENT_DATE) and call the dbt.dateadd macro instead (keep the
same datepart "day", interval -7, and CURRENT_DATE as the date argument) so the
compiled SQL will work across adapters; update the expression that references
activity_date and CURRENT_DATE accordingly.
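
To illustrate why the macro helps, the same filter compiles to different dialect-specific SQL depending on the adapter. The compiled forms below are approximations only; exact output varies by adapter and dbt version:

```sql
-- dbt source
WHERE activity_date >= {{ dbt.dateadd(datepart="day", interval=-7, from_date_or_timestamp="CURRENT_DATE") }}

-- Snowflake / Redshift (approximate compiled SQL)
WHERE activity_date >= DATEADD(day, -7, CURRENT_DATE)

-- PostgreSQL (approximate compiled SQL)
WHERE activity_date >= CURRENT_DATE + ((interval '1 day') * (-7))
```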
